When Doesn't Cokriging Outperform Kriging?

Abstract. In theory, cokriging yields a prediction variance no larger than that of kriging, yet in practice this advantage is sometimes hard to see. This should motivate theoretical studies of cokriging, for which results are generally lacking. In this work, we provide some theoretical results that compare cokriging with kriging by examining explicit models and specific sampling schemes.

Genton and Kleiber (2015) provided an excellent review of recent developments in multivariate covariance functions. In many situations, the ultimate objective of modeling the multivariate covariance function is to obtain superior prediction through cokriging. In theory, cokriging should have a prediction variance no larger than that of the kriging prediction. However, as the authors point out in the paper, the improvement of cokriging over kriging is sometimes very small or nonexistent. In this note, we try to shed some light on this through theoretical investigations.

For univariate Gaussian stationary processes, we now have a good understanding of the properties of kriging and of statistical inference. For example, theoretical results have been established to justify (i) that two different covariance functions may yield asymptotically equally optimal predictions (Stein, 1999), and (ii) that some parameters are not consistently estimable if the spatial domain is bounded (Zhang, 2004). We know the conditions under which a misspecified covariance function yields an asymptotically optimal prediction and can exploit this fact to simplify computations (Zhang, 2004; Du, Zhang and Mandrekar, 2009).

We lack the analogous understanding for multivariate spatial models. There are no explicit theoretical results to answer the following questions:

• How important is the cross-covariance function? Specifically, could two different multivariate covariance functions yield asymptotically equally optimal predictions?

• Which parameters are important to cokriging? We know which parameters are important to kriging.

• How much improvement does cokriging have over kriging?

One concept that has proved useful in the study of kriging is the equivalence of probability measures, owing to a theorem established by Blackwell and Dubins (1962). Let $s_i$, $i = 1, \ldots, n$, be sampling sites on a fixed domain (area) where the process $Y(s)$ is observed, and let $\{s_i, i > n\}$ be a set of sites on the same domain where $Y$ is to be predicted. If the two Gaussian measures $P_1$ and $P_2$ are equivalent on the $\sigma$-algebra generated by $Y(s_i)$, $i = 1, 2, \ldots$, then with $P_1$-probability one,

$$\sup_{A} \bigl| P_1\{A \mid Y(s_i), i = 1, \ldots, n\} - P_2\{A \mid Y(s_i), i = 1, \ldots, n\} \bigr| \to 0 \quad \text{as } n \to \infty,$$

where the supremum is taken over $A \in \sigma\{Y(s_i), i > n\}$. The above result implies that the linear predictions under the two measures are asymptotically equally optimal (Stein, 1999).

This result extends readily to multivariate spatial processes and therefore implies that two cokriging predictors are asymptotically equally optimal under the two probability measures if the two Gaussian measures are equivalent. However, unlike in the univariate case, there are very few results on the equivalence of probability measures. Ruiz-Medina and Porcu (2015) gave some general conditions for equivalent measures for multivariate Gaussian processes, though there is still a lack of explicit examples where equivalent measures occur.

We now provide some sufficient conditions for the equivalence of Gaussian measures for a particular bivariate model. Let $Y(s) = (Y_1(s), Y_2(s))'$ be a stationary bivariate Gaussian process with the following bivariate covariance function under the probability measure $P_k$, $k = 1, 2$:

$$C_{ij}(h) = \mathrm{Cov}(Y_i(s), Y_j(s+h)) = M(|h|, \sigma_{ij,k}, \alpha_k, \nu), \quad i, j = 1, 2,$$

where $M(\cdot, \sigma^2, \alpha, \nu)$ denotes the Matérn covariance function with variance $\sigma^2$, scale parameter $\alpha$ and smoothness parameter $\nu$. The following are sufficient conditions for the two measures $P_k$ to be equivalent on the $\sigma$-algebra generated by $\{Y_i(s), s \in D, i = 1, 2\}$ for some bounded set $D \subset \mathbb{R}^d$, $d \le 3$:

$$\sigma_{ii,1}\,\alpha_1^{2\nu} = \sigma_{ii,2}\,\alpha_2^{2\nu}, \quad i = 1, 2, \qquad \frac{\sigma_{12,1}}{\sqrt{\sigma_{11,1}\sigma_{22,1}}} = \frac{\sigma_{12,2}}{\sqrt{\sigma_{11,2}\sigma_{22,2}}}. \tag{1}$$
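As a small illustration, sufficient conditions of this kind are easy to check numerically for two candidate parameter sets. The sketch below assumes the marginal conditions take the form $\sigma_{ii,1}\alpha_1^{2\nu} = \sigma_{ii,2}\alpha_2^{2\nu}$ (as in Zhang, 2004) and that the cross condition equates the colocated correlation coefficients under the two measures. The function name `satisfies_conditions_1` and the dictionary layout are hypothetical choices made for this example only.

```python
import math

def satisfies_conditions_1(p1, p2, nu, rel_tol=1e-9):
    """Check the sufficient conditions for two bivariate Matern models.

    p1, p2: dicts with keys 's11', 's22' (variances), 's12' (cross-covariance)
            and 'alpha' (scale parameter). This layout is an assumption of the sketch.
    nu:     common smoothness parameter.
    """
    # Marginal conditions: sigma_ii * alpha^(2 nu) must agree under both measures.
    for key in ("s11", "s22"):
        a = p1[key] * p1["alpha"] ** (2 * nu)
        b = p2[key] * p2["alpha"] ** (2 * nu)
        if not math.isclose(a, b, rel_tol=rel_tol):
            return False
    # Cross condition: the colocated correlation coefficient must agree.
    r1 = p1["s12"] / math.sqrt(p1["s11"] * p1["s22"])
    r2 = p2["s12"] / math.sqrt(p2["s11"] * p2["s22"])
    return math.isclose(r1, r2, rel_tol=rel_tol, abs_tol=1e-12)

# Example with nu = 0.5: doubling alpha while halving the variances keeps
# sigma * alpha^(2 nu) fixed; scaling s12 the same way keeps r fixed.
p1 = {"s11": 2.0, "s22": 4.0, "s12": 1.0, "alpha": 1.0}
p2 = {"s11": 1.0, "s22": 2.0, "s12": 0.5, "alpha": 2.0}
print(satisfies_conditions_1(p1, p2, nu=0.5))
```

The two parameter sets differ in every individual parameter, yet they pass the check, which is the sense in which two different bivariate covariance functions can lead to equivalent measures.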


To prove this claim, we employ the Karhunen-Loève expansion under measure $P_1$. The two processes $\{Y_i(s)/\sqrt{\sigma_{ii,1}}\}$, $i = 1, 2$, have the same covariance function $M(|h|, 1, \alpha_1, \nu)$ and therefore possess the same Karhunen-Loève expansion under measure $P_1$,

$$\frac{Y_i(s)}{\sqrt{\sigma_{ii,1}}} = \sum_{l=1}^{\infty} \sqrt{\lambda_l}\, f_l(s)\, Z_{il}, \quad s \in D,$$

where for $i = 1, 2$, $\{Z_{il}, l = 1, 2, \ldots\}$ consists of i.i.d. standard normal random variables under measure $P_1$. Clearly, the eigenvalues $\lambda_l$ and eigenfunctions $f_l(s)$ only depend on the correlation function and hence do not depend on $i$. In addition,

$$Z_{il} = \frac{1}{\sqrt{\sigma_{ii,1}\lambda_l}} \int_D Y_i(s) f_l(s)\, ds.$$


Using the above expression, it is not hard to show that

$$E_1(Z_{1l} Z_{2m}) = r\, E_1(Z_{1l} Z_{1m}) \tag{2}$$

for $r = \sigma_{12,1}/\sqrt{\sigma_{11,1}\sigma_{22,1}}$, and

$$E_2(Z_{1l} Z_{2m}) = r\, E_2(Z_{1l} Z_{1m}). \tag{3}$$

The Karhunen-Loève expansion implies that $\{Z_{1l}, l = 1, 2, \ldots\}$ is a basis of the Hilbert space generated by $\{Y_1(s), s \in D\}$ with respect to measure $P_1$. Hence, $\{Z_{1l}, Z_{2l}, l = 1, 2, \ldots\}$ is a basis of the Hilbert space generated by the two processes $\{Y_i(s), i = 1, 2, s \in D\}$. The two measures are equivalent on the Hilbert space if and only if they are so on $\{Z_{1l}, Z_{2l}, l = 1, 2, \ldots\}$ (Ibragimov and Rozanov, 1978, page 72). To show the equivalence of the two measures, we only need to verify (Stein, 1999, page 129)

$$\sum_{i=1}^{2} \sum_{j=1}^{2} \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \bigl(E_1(Z_{il} Z_{jm}) - E_2(Z_{il} Z_{jm})\bigr)^2 < \infty. \tag{4}$$

Because conditions (1) imply that the two measures are equivalent on $\{Y_i(s), s \in D\}$ (Zhang, 2004), we must have

$$\sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \bigl(E_1(Z_{il} Z_{im}) - E_2(Z_{il} Z_{im})\bigr)^2 < \infty, \quad i = 1, 2.$$

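For $i \neq j$, equations (2) and (3) imply

$$\sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \bigl(E_1(Z_{1l} Z_{2m}) - E_2(Z_{1l} Z_{2m})\bigr)^2 = r^2 \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \bigl(E_1(Z_{1l} Z_{1m}) - E_2(Z_{1l} Z_{1m})\bigr)^2 < \infty.$$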

Therefore, (4) is proved and so is the sufficiency of 
the conditions. We now have an explicit example 
where two different bivariate covariance functions 
yield asymptotically equal cokriging results. 

Next, we try to explain why it is sometimes hard to see the improvement of cokriging over kriging. Consider a bivariate Gaussian process with mean 0 and exponential covariance functions such that

$$C_{ij}(h) = \mathrm{Cov}(Y_i(s), Y_j(s+h)) = \sigma_{ij} \exp(-\alpha |h|), \quad i, j = 1, 2. \tag{5}$$

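Assume the two processes are observed at $n$ points $s_i$, $i = 1, \ldots, n$, and that we wish to predict $Y_1(s)$ at an unsampled site $s$. Write $\mathbf{Y}_1 = (Y_1(s_i), i = 1, \ldots, n)'$, $\mathbf{Y}_2 = (Y_2(s_i), i = 1, \ldots, n)'$ and $\mathbf{Y} = (\mathbf{Y}_1', \mathbf{Y}_2')'$. It is known that in this case the cokriging predictor is identical to the kriging predictor. To see this, let $R$ denote the correlation matrix of $\mathbf{Y}_1$, which is also the correlation matrix of $\mathbf{Y}_2$. Then

$$\mathrm{Cov}(\mathbf{Y}_i, \mathbf{Y}_j) = \sigma_{ij} R.$$

Let $V$ be the matrix with $(i,j)$th element $\sigma_{ij}$. Then the covariance matrix of $\mathbf{Y}$ is $V \otimes R$. Let $\mathbf{k}$ denote the vector of correlation coefficients between $Y_1(s)$, the variable to be predicted, and $\mathbf{Y}_1$. Because $(\sigma_{11}, \sigma_{12})$ is the first row of $V$, we have $(\sigma_{11}, \sigma_{12}) V^{-1} = (1, 0)$, and therefore

$$E(Y_1(s) \mid \mathbf{Y}_1, \mathbf{Y}_2) = \bigl((\sigma_{11}, \sigma_{12}) \otimes \mathbf{k}'\bigr)\bigl(V^{-1} \otimes R^{-1}\bigr)\mathbf{Y} \tag{6}$$

$$= \bigl((1, 0) \otimes \mathbf{k}' R^{-1}\bigr)\mathbf{Y} = \mathbf{k}' R^{-1} \mathbf{Y}_1 = E(Y_1(s) \mid \mathbf{Y}_1). \tag{7}$$

Therefore, cokriging is identical to kriging and we should not expect any improvement of cokriging over kriging. We can also show that they are identical if $Y_2(s)$ is observed at a subset of the locations where $Y_1$ is observed.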
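The identity between cokriging and kriging under the proportional model (5) can be checked numerically. The following is a minimal sketch, assuming one-dimensional sites, simple (zero-mean) cokriging, and arbitrary illustrative values for the sites, $V$ and $\alpha$. The function name `cokriging_weights` is hypothetical.

```python
import numpy as np

def cokriging_weights(sites, s0, V, alpha):
    """Simple cokriging weights for Y1(s0) under C_ij(h) = V[i, j] * exp(-alpha * |h|),
    with both variables observed at every site in `sites` (1-D locations)."""
    sites = np.asarray(sites, dtype=float)
    R = np.exp(-alpha * np.abs(sites[:, None] - sites[None, :]))  # correlation matrix
    k = np.exp(-alpha * np.abs(sites - s0))                       # correlations with the target
    Sigma = np.kron(V, R)                                         # covariance of (Y1', Y2')'
    c = np.concatenate([V[0, 0] * k, V[0, 1] * k])                # Cov(Y1(s0), (Y1', Y2')')
    w = np.linalg.solve(Sigma, c)                                 # cokriging weights
    kriging_w = np.linalg.solve(R, k)                             # kriging weights (Y1 only)
    return w, kriging_w

# Illustrative values only; V must be positive definite.
V = np.array([[1.0, 0.6], [0.6, 2.0]])
w, kriging_w = cokriging_weights([-0.7, -0.2, 0.3, 0.9], 0.0, V, alpha=3.0)
print(np.allclose(w[:4], kriging_w), np.allclose(w[4:], 0.0))
```

The first half of `w` multiplies the observations of $Y_1$ and the second half those of $Y_2$. Up to rounding, the second half vanishes and the first half coincides with the kriging weights, whatever the value of $\sigma_{12}$.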

One scenario where cokriging might outperform kriging is when the auxiliary variable is observed at more locations than the predicted variable. In the next example, we examine analytically which quantities affect the improvement of cokriging over kriging. We assume the same bivariate model (5), with $Y_2(s)$ observed at $s \in O = \{i/n, i = \pm 1, \pm 2, \ldots, \pm n\}$ but $Y_1(s)$ observed at only half of the points, $s \in O_1 = \{2i/n, i = \pm 1, \pm 2, \ldots, \pm n/2\}$, where $n$ is an even integer. Denote the kriging predictor and the cokriging predictor of $Y_1(0)$ by

$$\hat{Y}_1(0) = E(Y_1(0) \mid Y_1(s), s \in O_1), \tag{8}$$

$$\tilde{Y}_1(0) = E(Y_1(0) \mid Y_1(s), s \in O_1, Y_2(t), t \in O). \tag{9}$$

We will derive the following asymptotic relative efficiency of kriging to cokriging:

$$\lim_{n \to \infty} \frac{E(\tilde{Y}_1(0) - Y_1(0))^2}{E(\hat{Y}_1(0) - Y_1(0))^2} = 1 - \frac{r^2}{2}, \tag{10}$$

where $r$ is the correlation coefficient of $Y_1(s)$ and $Y_2(s)$.

The asymptotic relative efficiency of kriging prediction does not depend on the scale parameter $\alpha$. Intuitively this is understandable. However, for a finite sample size $n$, $\alpha$ may affect the efficiency. We now present a simulation study to see how $\alpha$ and $r$ affect the relative efficiency of kriging prediction. We consider the exponential covariance model with $\sigma_{11} = \sigma_{22} = 1$, $r = 0.2$ and $0.5$, and $\alpha = 2$, $4$ and $8$. The auxiliary variable $Y_2$ is observed at $\pm i/n$, $i = 1, \ldots, n$, but the primary variable $Y_1$ is observed at $\pm i/n$ for even integers $0 < i \le n$. We calculate the prediction variance for predicting $Y_1(0)$ using both kriging and cokriging and obtain the relative efficiency of kriging for different $n$, $\alpha$ and $r$.
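The calculation described above can be sketched as follows. This is an illustrative reimplementation under the stated design (unit variances, $Y_2$ on $\pm i/n$ and $Y_1$ on the even $i$), not the code behind the original study. The function name `relative_efficiency` is hypothetical, and $n$ is assumed even.

```python
import numpy as np

def relative_efficiency(n, alpha, r):
    """Ratio of cokriging to kriging prediction variance for Y1(0), with
    sigma_11 = sigma_22 = 1, Y2 observed at +-i/n (i = 1..n) and Y1 at the even i."""
    idx = np.concatenate([np.arange(-n, 0), np.arange(1, n + 1)])
    s2 = idx / n                      # sites of the auxiliary variable Y2
    s1 = idx[idx % 2 == 0] / n        # sites of the primary variable Y1

    def corr(a, b):
        # Exponential correlation between two sets of 1-D sites.
        return np.exp(-alpha * np.abs(a[:, None] - b[None, :]))

    zero = np.array([0.0])
    # Kriging: condition on Y1 only.
    k1 = corr(s1, zero)[:, 0]
    var_krig = 1.0 - k1 @ np.linalg.solve(corr(s1, s1), k1)
    # Cokriging: condition on Y1 at s1 and Y2 at s2.
    Sigma = np.block([[corr(s1, s1), r * corr(s1, s2)],
                      [r * corr(s2, s1), corr(s2, s2)]])
    c = np.concatenate([k1, r * corr(s2, zero)[:, 0]])
    var_cok = 1.0 - c @ np.linalg.solve(Sigma, c)
    return var_cok / var_krig

# Compare finite-n values with the asymptotic value 1 - r^2 / 2.
for n in (10, 20, 40, 80):
    print(n, relative_efficiency(n, alpha=2.0, r=0.5), 1 - 0.5**2 / 2)
```

Looping over $\alpha \in \{2, 4, 8\}$ and $r \in \{0.2, 0.5\}$ in the same way gives the quantities that are plotted against $n$.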

Figure 1 plots the relative efficiency for different $r$, $\alpha$ and $n$. We see that the relative efficiency of kriging decreases as $n$ increases, which means that the advantage of cokriging over kriging is more likely to be seen when $n$ is larger. When the spatial autocorrelation is strong (i.e., $\alpha$ is smaller), the asymptotic efficiency is reached faster (i.e., for moderate $n$). This agrees with many other infill asymptotic results.

We now prove (10). We first note a Markovian property of the exponential model established by Du, Zhang and Mandrekar (2009), which says that $E(Y_1(s) \mid Y_1(t), t \in B)$ only depends on the two nearest neighbors of $s$ in a finite set $B$ such that $s$ is between the minimum and the maximum elements of $B$ (Du, Zhang and Mandrekar, 2009, Lemma 1). Also from the lemma, we obtain

$$E(\hat{Y}_1(0) - Y_1(0))^2 = 2\sigma_{11}\alpha/n + o(n^{-2}).$$

In the extreme case when $r = 1$, we can view the process $Y_1(s)$ as being observed at $O$. Then in this extreme case, the above equation implies

$$E(\tilde{Y}_1(0) - Y_1(0))^2 = \sigma_{11}\alpha/n + o(n^{-2}).$$

The ratio in (10) is clearly $1/2$. Hence, we have verified (10) for this extreme case. On the other hand, when $r = 0$, the two predictors $\hat{Y}_1(0)$ and $\tilde{Y}_1(0)$ are identical and (10) is obviously true.
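For readers who want the intermediate step, the two-neighbor prediction variance can be written out explicitly. This is a sketch assuming, by the Markovian property, that only the two observations at distance $d$ on either side of $0$ matter, with $\rho = e^{-\alpha d}$; the symbols $b$, $d$ and $\rho$ are introduced here for illustration.

```latex
\begin{aligned}
\hat{Y}_1(0) &= b\,\{Y_1(-d) + Y_1(d)\}, \qquad
  b = \frac{\rho}{1+\rho^{2}}, \quad \rho = e^{-\alpha d},\\[4pt]
E\{\hat{Y}_1(0) - Y_1(0)\}^{2}
  &= \sigma_{11}\Bigl(1 - \frac{2\rho^{2}}{1+\rho^{2}}\Bigr)
   = \sigma_{11}\,\frac{1-\rho^{2}}{1+\rho^{2}}
   = \sigma_{11}\tanh(\alpha d)
   = \sigma_{11}\,\alpha d + O(d^{3}).
\end{aligned}
```

Taking $d = 2/n$ gives the kriging variance $2\sigma_{11}\alpha/n + o(n^{-2})$, and $d = 1/n$ gives $\sigma_{11}\alpha/n + o(n^{-2})$ for the case $r = 1$.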

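We are going to show that

$$\tilde{Y}_1(0) = b_1 Y_1(-2/n) + b_2 Y_1(2/n) + b_3 Y_2(-2/n) + b_4 Y_2(-1/n) + b_5 Y_2(1/n) + b_6 Y_2(2/n), \tag{11}$$

where

$$b_1 = b_2 = \frac{e^{-2\alpha/n}}{1 + e^{-4\alpha/n}}, \qquad b_3 = b_6 = -\frac{r e^{-2\alpha/n}}{1 + e^{-4\alpha/n}}, \tag{12}$$

$$b_4 = b_5 = \frac{r e^{-\alpha/n}}{1 + e^{-2\alpha/n}}. \tag{13}$$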

[Figure 1 appears here; panel shown: $r = 0.5$.]
Fig. 1. Relative efficiency of kriging to cokriging for different $r$, $\alpha$ and $n$. The solid horizontal line is the asymptotic relative efficiency $1 - r^2/2$.


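Some straightforward calculation yields

$$E(\tilde{Y}_1(0) - Y_1(0))^2 = \sigma_{11}\, \frac{1 + e^{-2\alpha/n} - e^{-4\alpha/n} - e^{-6\alpha/n} + 2r^2\bigl(e^{-4\alpha/n} - e^{-2\alpha/n}\bigr)}{(e^{-4\alpha/n} + 1)(e^{-2\alpha/n} + 1)} = \sigma_{11}(2 - r^2)\alpha/n + o(n^{-2}).$$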

Then (10) immediately follows. Hence, it is sufficient to show (11). It is possible to show that $\tilde{Y}_1(0) - Y_1(0)$ is uncorrelated with any $Y_1(s)$, $s \in O_1$, and with any $Y_2(t)$, $t \in O$. Hence, $\tilde{Y}_1(0)$ must be the best linear prediction. Here we take an alternative but more intuitive approach. We will apply the Markovian property of the Gaussian exponential model to show that $\tilde{Y}_1(0)$ only depends on $Y_1(-2/n)$, $Y_1(2/n)$, $Y_2(-2/n)$, $Y_2(-1/n)$, $Y_2(1/n)$ and $Y_2(2/n)$. Consequently, the coefficients $b_i$ in (12) and (13) can be found by solving linear equations.
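The linear equations mentioned above are the normal equations for the six retained observations, and they are small enough to solve numerically. The sketch below assumes unit variances ($\sigma_{11} = \sigma_{22} = 1$) and compares the numerical solution with the closed forms $b_1 = b_2 = e^{-2\alpha/n}/(1 + e^{-4\alpha/n})$, $b_3 = b_6 = -r b_1$ and $b_4 = b_5 = r e^{-\alpha/n}/(1 + e^{-2\alpha/n})$. The function names are hypothetical.

```python
import numpy as np

def cokriging_coefficients(n, alpha, r):
    """Solve the normal equations for predicting Y1(0) from
    Y1(-2/n), Y1(2/n), Y2(-2/n), Y2(-1/n), Y2(1/n), Y2(2/n) (unit variances)."""
    sites = np.array([-2, 2, -2, -1, 1, 2]) / n
    var = np.array([1, 1, 2, 2, 2, 2])            # which variable each entry belongs to
    rho = np.exp(-alpha * np.abs(sites[:, None] - sites[None, :]))
    # Same variable: correlation rho; different variables: r * rho.
    Sigma = np.where(var[:, None] == var[None, :], 1.0, r) * rho
    c = np.where(var == 1, 1.0, r) * np.exp(-alpha * np.abs(sites))
    return np.linalg.solve(Sigma, c)

def closed_form(n, alpha, r):
    """Coefficients b1..b6 in the closed form quoted in the lead-in."""
    x = np.exp(-2 * alpha / n)
    b1 = x / (1 + x**2)
    b4 = r * np.exp(-alpha / n) / (1 + x)
    return np.array([b1, b1, -r * b1, b4, b4, -r * b1])

print(np.allclose(cokriging_coefficients(20, 4.0, 0.5), closed_form(20, 4.0, 0.5)))
```

The normal equations require $|r| < 1$; at $r = \pm 1$ the observations of $Y_1$ and $Y_2$ at $\pm 2/n$ are perfectly collinear and the system is singular.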

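For any odd integer $i$ between $-n$ and $n$,

$$\begin{aligned}
&E\{Y_2(i/n) \mid Y_1(s), s \in O_1, Y_2(t), t \in O, t \neq i/n\}\\
&\quad = E\{E(Y_2(i/n) \mid Y_1(t), Y_2(t), t \in O, t \neq i/n) \mid Y_1(s), s \in O_1, Y_2(t), t \in O, t \neq i/n\}\\
&\quad = E\{E(Y_2(i/n) \mid Y_2(t), t \in O, t \neq i/n) \mid Y_1(s), s \in O_1, Y_2(t), t \in O, t \neq i/n\}\\
&\quad = E\{Y_2(i/n) \mid Y_2(t_{i-}), Y_2(t_{i+})\},
\end{aligned} \tag{14}$$

where $t_{i-}$ and $t_{i+}$ are the two nearest neighbors of $i/n$ in $O$. For example, for $i = -1$, $t_{i-} = -2/n$ and $t_{i+} = 1/n$.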
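The reduction to the two nearest neighbors can be checked numerically by regressing $Y_2(i/n)$ on all the other observations and inspecting which weights are nonzero. The sketch below assumes unit variances and a small even $n$; the function name `regression_weights_for_y2` is hypothetical.

```python
import numpy as np

def regression_weights_for_y2(n, alpha, r, i):
    """Weights of the best linear predictor of Y2(i/n) given Y1 on O1 and
    Y2 on O with the site i/n removed (unit variances, i odd, n even)."""
    idx = np.concatenate([np.arange(-n, 0), np.arange(1, n + 1)])
    o1 = idx[idx % 2 == 0]                 # Y1 sites, as integer multiples of 1/n
    o2 = idx[idx != i]                     # Y2 sites with i/n removed
    sites = np.concatenate([o1, o2]) / n
    is_y2 = np.concatenate([np.zeros(o1.size, bool), np.ones(o2.size, bool)])
    rho = np.exp(-alpha * np.abs(sites[:, None] - sites[None, :]))
    # Same variable: correlation rho; different variables: r * rho.
    Sigma = np.where(is_y2[:, None] == is_y2[None, :], 1.0, r) * rho
    c = np.where(is_y2, 1.0, r) * np.exp(-alpha * np.abs(sites - i / n))
    w = np.linalg.solve(Sigma, c)
    return o1, w[:o1.size], o2, w[o1.size:]

o1, w1, o2, w2 = regression_weights_for_y2(n=6, alpha=2.0, r=0.5, i=-1)
# All weights on Y1 should vanish; the only Y2 sites with nonzero weight
# should be the two nearest neighbors of i/n.
print(np.allclose(w1, 0.0), o2[np.abs(w2) > 1e-10])
```

The sites are reported as integer multiples of $1/n$, so for $i = -1$ the retained neighbors correspond to $-2/n$ and $1/n$, matching the example in the text.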

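Define $e_i = Y_2(i/n) - E\{Y_2(i/n) \mid Y_2(t_{i-}), Y_2(t_{i+})\}$ for an odd $i$. Then $e_i$ is independent of $Y_1(s)$, $s \in O_1$, and of $Y_2(t)$, $t \in O$ and $t \neq i/n$. Consequently,

$$\begin{aligned}
&E(Y_1(0) \mid Y_1(s), s \in O_1, Y_2(t), t \in O)\\
&\quad = E(Y_1(0) \mid Y_1(s), Y_2(s), s \in O_1, e_i, i \text{ odd})\\
&\quad = E(Y_1(0) \mid Y_1(s), Y_2(s), s \in O_1) + E(Y_1(0) \mid e_i, i \text{ odd}).
\end{aligned} \tag{15}$$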

The first term in the above equation depends only on $Y_1(-2/n)$ and $Y_1(2/n)$ due to the Markovian property. For the second term, because the cross-covariance function is proportional to the covariance function of $Y_2(t)$, we have

$$E(Y_1(0) \mid e_i, i \text{ odd}) = r E(Y_2(0) \mid e_i, i \text{ odd}).$$

Applying again the property of conditional expectation and the Markovian property, we get

$$\begin{aligned}
E(Y_2(0) \mid e_i, i \text{ odd})
&= E\{E(Y_2(0) \mid Y_2(t), t \in O) \mid e_i, i \text{ odd}\}\\
&= \beta E\{Y_2(-1/n) + Y_2(1/n) \mid e_i, i \text{ odd}\}\\
&= \beta E\{Y_2(-1/n) + Y_2(1/n) \mid e_{-1}, e_1\},
\end{aligned}$$

where $\beta$ is the constant in $E(Y_2(0) \mid Y_2(-1/n), Y_2(1/n)) = \beta(Y_2(-1/n) + Y_2(1/n))$, and the last equation follows from the fact that $e_i$ is independent of $Y_2(1/n)$ and $Y_2(-1/n)$ if $i \neq 1$ or $-1$. Therefore, the second term of (15) is a linear function of $e_{-1}$ and $e_1$ and hence a linear function of $Y_2(i/n)$, $i = -2, -1, 1$ and $2$.